Search Results: "bs"

3 April 2024

Arnaud Rebillout: Firefox: Moving from the Debian package to the Flatpak app (long-term?)

First, thanks to Samuel Henrique for giving notice of recent Firefox CVEs in Debian testing/unstable. At the time I didn't want to upgrade my system (Debian Sid) due to the ongoing t64 transition transition, so I decided I could install the Firefox Flatpak app instead, and why not stick to it long-term? This blog post details all the steps, if ever others want to go the same road. Flatpak Installation Disclaimer: this section is hardly anything more than a copy/paste of the official documentation, and with time it will get outdated, so you'd better follow the official doc. First thing first, let's install Flatpak:
$ sudo apt update
$ sudo apt install flatpak
Then the next step is to add the Flathub remote repository, from where we'll get our Flatpak applications:
$ flatpak remote-add --if-not-exists flathub https://dl.flathub.org/repo/flathub.flatpakrepo
And that's all there is to it! Now come the optional steps. For GNOME and KDE users, you might want to install a plugin for the software manager specific to your desktop, so that it can support and manage Flatpak apps:
$ which -s gnome-software  && sudo apt install gnome-software-plugin-flatpak
$ which -s plasma-discover && sudo apt install plasma-discover-backend-flatpak
And here's an additional check you can do, as it's something that did bite me in the past: missing xdg-portal-* packages, that are required for Flatpak applications to communicate with the desktop environment. Just to be sure, you can check the output of apt search '^xdg-desktop-portal' to see what's available, and compare with the output of dpkg -l grep xdg-desktop-portal. As you can see, if you're a GNOME or KDE user, there's a portal backend for you, and it should be installed. For reference, this is what I have on my GNOME desktop at the moment:
$ dpkg -l   grep xdg-desktop-portal   awk ' print $2 '
xdg-desktop-portal
xdg-desktop-portal-gnome
xdg-desktop-portal-gtk
Install the Firefox Flatpak app This is trivial, but still, there's a question I've always asked myself: should I install applications system-wide (aka. flatpak --system, the default) or per-user (aka. flatpak --user)? Turns out, this questions is answered in the Flatpak documentation:
Flatpak commands are run system-wide by default. If you are installing applications for day-to-day usage, it is recommended to stick with this default behavior.
Armed with this new knowledge, let's install the Firefox app:
$ flatpak install flathub org.mozilla.firefox
And that's about it! We can give it a go already:
$ flatpak run org.mozilla.firefox
Data migration At this point, running Firefox via Flatpak gives me an "empty" Firefox. That's not what I want, instead I want my usual Firefox, with a gazillion of tabs already opened, a few extensions, bookmarks and so on. As it turns out, Mozilla provides a brief doc for data migration, and it's as simple as moving Firefox data directory around! To clarify, we'll be copying data: Make sure that all Firefox instances are closed, then proceed:
# BEWARE! Below I'm erasing data!
$ rm -fr ~/.var/app/org.mozilla.firefox/.mozilla/firefox/
$ cp -a ~/.mozilla/firefox/ ~/.var/app/org.mozilla.firefox/.mozilla/
To avoid confusing myself, it's also a good idea to rename the local data directory:
$ mv ~/.mozilla/firefox ~/.mozilla/firefox.old.$(date --iso-8601=date)
At this point, flatpak run org.mozilla.firefox takes me to my "usual" everyday Firefox, with all its tabs opened, pinned, bookmarked, etc. More integration? After following all the steps above, I must say that I'm 99% happy. So far, everything works as before, I didn't hit any issue, and I don't even notice that Firefox is running via Flatpak, it's completely transparent. So where's the 1% of unhappiness? The Run a Command dialog from GNOME, the one that shows up via the keyboard shortcut <Alt+F2>. This is how I start my GUI applications, and I usually run two Firefox instances in parallel (one for work, one for personal), using the firefox -p <profile> command. Given that I ran apt purge firefox before (to avoid confusing myself with two installations of Firefox), now the right (and only) way to start Firefox from a command-line is to type flatpak run org.mozilla.firefox -p <profile>. Typing that every time is way too cumbersome, so I need something quicker. Seems like the most straightforward is to create a wrapper script:
$ cat /usr/local/bin/firefox 
#!/bin/sh
exec flatpak run org.mozilla.firefox "$@"
And now I can just hit <Alt+F2> and type firefox -p <profile> to start Firefox with the profile I want, just as before. Neat! Looking forward: system updates I usually update my system manually every now and then, via the well-known pair of commands:
$ sudo apt update
$ sudo apt full-upgrade
The downside of introducing Flatpak, ie. introducing another package manager, is that I'll need to learn new commands to update the software that comes via this channel. Fortunately, there's really not much to learn. From flatpak-update(1):
flatpak update [OPTION...] [REF...] Updates applications and runtimes. [...] If no REF is given, everything is updated, as well as appstream info for all remotes.
Could it be that simple? Apparently yes, the Flatpak equivalent of the two apt commands above is just:
$ flatpak update
Going forward, my options are:
  1. Teach myself to run flatpak update additionally to apt update, manually, everytime I update my system.
  2. Go crazy: let something automatically update my Flatpak apps, in my back and without my consent.
I'm actually tempted to go for option 2 here, and I wonder if GNOME Software will do that for me, provided that I installed gnome-software-plugin-flatpak, and that I checked Software Updates -> Automatic in the Settings (which I did). However, I didn't find any documentation regarding what this setting really does, so I can't say if it will only download updates, or if it will also install it. I'd be happy if it automatically installs new version of Flatpak apps, but at the same time I'd be very unhappy if it automatically upgrades my Debian system... So we'll see. Enough for today, hope this blog post was useful!

2 April 2024

Bits from Debian: Bits from the DPL

Dear Debianites This morning I decided to just start writing Bits from DPL and send whatever I have by 18:00 local time. Here it is, barely proof read, along with all it's warts and grammar mistakes! It's slightly long and doesn't contain any critical information, so if you're not in the mood, don't feel compelled to read it! Get ready for a new DPL! Soon, the voting period will start to elect our next DPL, and my time as DPL will come to an end. Reading the questions posted to the new candidates on debian-vote, it takes quite a bit of restraint to not answer all of them myself, I think I can see how that aspect contributed to me being reeled in to running for DPL! In total I've done so 5 times (the first time I ran, Sam was elected!). Good luck to both Andreas and Sruthi, our current DPL candidates! I've already started working on preparing handover, and there's multiple request from teams that have came in recently that will have to wait for the new term, so I hope they're both ready to hit the ground running! Things that I wish could have gone better Communication Recently, I saw a t-shirt that read:
Adulthood is saying, 'But after this week things will slow down a bit' over and over until you die.
I can relate! With every task, crisis or deadline that appears, I think that once this is over, I'll have some more breathing space to get back to non-urgent, but important tasks. "Bits from the DPL" was something I really wanted to get right this last term, and clearly failed spectacularly. I have two long Bits from the DPL drafts that I never finished, I tend to have prioritised problems of the day over communication. With all the hindsight I have, I'm not sure which is better to prioritise, I do rate communication and transparency very highly and this is really the top thing that I wish I could've done better over the last four years. On that note, thanks to people who provided me with some kind words when I've mentioned this to them before. They pointed out that there are many other ways to communicate and be in touch with the community, and they mentioned that they thought that I did a good job with that. Since I'm still on communication, I think we can all learn to be more effective at it, since it's really so important for the project. Every time I publicly spoke about us spending more money, we got more donations. People out there really like to see how we invest funds in to Debian, instead of just making it heap up. DSA just spent a nice chunk on money on hardware, but we don't have very good visibility on it. It's one thing having it on a public line item in SPI's reporting, but it would be much more exciting if DSA could provide a write-up on all the cool hardware they're buying and what impact it would have on developers, and post it somewhere prominent like debian-devel-announce, Planet Debian or Bits from Debian (from the publicity team). I don't want to single out DSA there, it's difficult and affects many other teams. The Salsa CI team also spent a lot of resources (time and money wise) to extend testing on AMD GPUs and other AMD hardware. It's fantastic and interesting work, and really more people within the project and in the outside world should know about it! I'm not going to push my agendas to the next DPL, but I hope that they continue to encourage people to write about their work, and hopefully at some point we'll build enough excitement in doing so that it becomes a more normal part of our daily work. Founding Debian as a standalone entity This was my number one goal for the project this last term, which was a carried over item from my previous terms. I'm tempted to write everything out here, including the problem statement and our current predicaments, what kind of ground work needs to happen, likely constitutional changes that need to happen, and the nature of the GR that would be needed to make such a thing happen, but if I start with that, I might not finish this mail. In short, I 100% believe that this is still a very high ranking issue for Debian, and perhaps after my term I'd be in a better position to spend more time on this (hmm, is this an instance of "The grass is always better on the other side", or "Next week will go better until I die?"). Anyway, I'm willing to work with any future DPL on this, and perhaps it can in itself be a delegation tasked to properly explore all the options, and write up a report for the project that can lead to a GR. Overall, I'd rather have us take another few years and do this properly, rather than rush into something that is again difficult to change afterwards. So while I very much wish this could've been achieved in the last term, I can't say that I have any regrets here either. My terms in a nutshell COVID-19 and Debian 11 era My first term in 2020 started just as the COVID-19 pandemic became known to spread globally. It was a tough year for everyone, and Debian wasn't immune against its effects either. Many of our contributors got sick, some have lost loved ones (my father passed away in March 2020 just after I became DPL), some have lost their jobs (or other earners in their household have) and the effects of social distancing took a mental and even physical health toll on many. In Debian, we tend to do really well when we get together in person to solve problems, and when DebConf20 got cancelled in person, we understood that that was necessary, but it was still more bad news in a year we had too much of it already. I can't remember if there was ever any kind of formal choice or discussion about this at any time, but the DebConf video team just kind of organically and spontaneously became the orga team for an online DebConf, and that lead to our first ever completely online DebConf. This was great on so many levels. We got to see each other's faces again, even though it was on screen. We had some teams talk to each other face to face for the first time in years, even though it was just on a Jitsi call. It had a lasting cultural change in Debian, some teams still have video meetings now, where they didn't do that before, and I think it's a good supplement to our other methods of communication. We also had a few online Mini-DebConfs that was fun, but DebConf21 was also online, and by then we all developed an online conference fatigue, and while it was another good online event overall, it did start to feel a bit like a zombieconf and after that, we had some really nice events from the Brazillians, but no big global online community events again. In my opinion online MiniDebConfs can be a great way to develop our community and we should spend some further energy into this, but hey! This isn't a platform so let me back out of talking about the future as I see it... Despite all the adversity that we faced together, the Debian 11 release ended up being quite good. It happened about a month or so later than what we ideally would've liked, but it was a solid release nonetheless. It turns out that for quite a few people, staying inside for a few months to focus on Debian bugs was quite productive, and Debian 11 ended up being a very polished release. During this time period we also had to deal with a previous Debian Developer that was expelled for his poor behaviour in Debian, who continued to harass members of the Debian project and in other free software communities after his expulsion. This ended up being quite a lot of work since we had to take legal action to protect our community, and eventually also get the police involved. I'm not going to give him the satisfaction by spending too much time talking about him, but you can read our official statement regarding Daniel Pocock here: https://www.debian.org/News/2021/20211117 In late 2021 and early 2022 we also discussed our general resolution process, and had two consequent votes to address some issues that have affected past votes: In my first term I addressed our delegations that were a bit behind, by the end of my last term all delegation requests are up to date. There's still some work to do, but I'm feeling good that I get to hand this over to the next DPL in a very decent state. Delegation updates can be very deceiving, sometimes a delegation is completely re-written and it was just 1 or 2 hours of work. Other times, a delegation updated can contain one line that has changed or a change in one team member that was the result of days worth of discussion and hashing out differences. I also received quite a few requests either to host a service, or to pay a third-party directly for hosting. This was quite an admin nightmare, it either meant we had to manually do monthly reimbursements to someone, or have our TOs create accounts/agreements at the multiple providers that people use. So, after talking to a few people about this, we founded the DebianNet team (we could've admittedly chosen a better name, but that can happen later on) for providing hosting at two different hosting providers that we have agreement with so that people who host things under debian.net have an easy way to host it, and then at the same time Debian also has more control if a site maintainer goes MIA. More info: https://wiki.debian.org/Teams/DebianNet You might notice some Openstack mentioned there, we had some intention to set up a Debian cloud for hosting these things, that could also be used for other additional Debiany things like archive rebuilds, but these have so far fallen through. We still consider it a good idea and hopefully it will work out some other time (if you're a large company who can sponsor few racks and servers, please get in touch!) DebConf22 and Debian 12 era DebConf22 was the first time we returned to an in-person DebConf. It was a bit smaller than our usual DebConf - understandably so, considering that there were still COVID risks and people who were at high risk or who had family with high risk factors did the sensible thing and stayed home. After watching many MiniDebConfs online, I also attended my first ever MiniDebConf in Hamburg. It still feels odd typing that, it feels like I should've been at one before, but my location makes attending them difficult (on a side-note, a few of us are working on bootstrapping a South African Debian community and hopefully we can pull off MiniDebConf in South Africa later this year). While I was at the MiniDebConf, I gave a talk where I covered the evolution of firmware, from the simple e-proms that you'd find in old printers to the complicated firmware in modern GPUs that basically contain complete operating systems- complete with drivers for the device their running on. I also showed my shiny new laptop, and explained that it's impossible to install that laptop without non-free firmware (you'd get a black display on d-i or Debian live). Also that you couldn't even use an accessibility mode with audio since even that depends on non-free firmware these days. Steve, from the image building team, has said for a while that we need to do a GR to vote for this, and after more discussion at DebConf, I kept nudging him to propose the GR, and we ended up voting in favour of it. I do believe that someone out there should be campaigning for more free firmware (unfortunately in Debian we just don't have the resources for this), but, I'm glad that we have the firmware included. In the end, the choice comes down to whether we still want Debian to be installable on mainstream bare-metal hardware. At this point, I'd like to give a special thanks to the ftpmasters, image building team and the installer team who worked really hard to get the changes done that were needed in order to make this happen for Debian 12, and for being really proactive for remaining niggles that was solved by the time Debian 12.1 was released. The included firmware contributed to Debian 12 being a huge success, but it wasn't the only factor. I had a list of personal peeves, and as the hard freeze hit, I lost hope that these would be fixed and made peace with the fact that Debian 12 would release with those bugs. I'm glad that lots of people proved me wrong and also proved that it's never to late to fix bugs, everything on my list got eliminated by the time final freeze hit, which was great! We usually aim to have a release ready about 2 years after the previous release, sometimes there are complications during a freeze and it can take a bit longer. But due to the excellent co-ordination of the release team and heavy lifting from many DDs, the Debian 12 release happened 21 months and 3 weeks after the Debian 11 release. I hope the work from the release team continues to pay off so that we can achieve their goals of having shorter and less painful freezes in the future! Even though many things were going well, the ongoing usr-merge effort highlighted some social problems within our processes. I started typing out the whole history of usrmerge here, but it's going to be too long for the purpose of this mail. Important questions that did come out of this is, should core Debian packages be team maintained? And also about how far the CTTE should really be able to override a maintainer. We had lots of discussion about this at DebConf22, but didn't make much concrete progress. I think that at some point we'll probably have a GR about package maintenance. Also, thank you to Guillem who very patiently explained a few things to me (after probably having have to done so many times to others before already) and to Helmut who have done the same during the MiniDebConf in Hamburg. I think all the technical and social issues here are fixable, it will just take some time and patience and I have lots of confidence in everyone involved. UsrMerge wiki page: https://wiki.debian.org/UsrMerge DebConf 23 and Debian 13 era DebConf23 took place in Kochi, India. At the end of my Bits from the DPL talk there, someone asked me what the most difficult thing I had to do was during my terms as DPL. I answered that nothing particular stood out, and even the most difficult tasks ended up being rewarding to work on. Little did I know that my most difficult period of being DPL was just about to follow. During the day trip, one of our contributors, Abraham Raji, passed away in a tragic accident. There's really not anything anyone could've done to predict or stop it, but it was devastating to many of us, especially the people closest to him. Quite a number of DebConf attendees went to his funeral, wearing the DebConf t-shirts he designed as a tribute. It still haunts me when I saw his mother scream "He was my everything! He was my everything!", this was by a large margin the hardest day I've ever had in Debian, and I really wasn't ok for even a few weeks after that and I think the hurt will be with many of us for some time to come. So, a plea again to everyone, please take care of yourself! There's probably more people that love you than you realise. A special thanks to the DebConf23 team, who did a really good job despite all the uphills they faced (and there were many!). As DPL, I think that planning for a DebConf is near to impossible, all you can do is show up and just jump into things. I planned to work with Enrico to finish up something that will hopefully save future DPLs some time, and that is a web-based DD certificate creator instead of having the DPL do so manually using LaTeX. It already mostly works, you can see the work so far by visiting https://nm.debian.org/person/ACCOUNTNAME/certificate/ and replacing ACCOUNTNAME with your Debian account name, and if you're a DD, you should see your certificate. It still needs a few minor changes and a DPL signature, but at this point I think that will be finished up when the new DPL start. Thanks to Enrico for working on this! Since my first term, I've been trying to find ways to improve all our accounting/finance issues. Tracking what we spend on things, and getting an annual overview is hard, especially over 3 trusted organisations. The reimbursement process can also be really tedious, especially when you have to provide files in a certain order and combine them into a PDF. So, at DebConf22 we had a meeting along with the treasurer team and Stefano Rivera who said that it might be possible for him to work on a new system as part of his Freexian work. It worked out, and Freexian funded the development of the system since then, and after DebConf23 we handled the reimbursements for the conference via the new reimbursements site: https://reimbursements.debian.net/ It's still early days, but over time it should be linked to all our TOs and we'll use the same category codes across the board. So, overall, our reimbursement process becomes a lot simpler, and also we'll be able to get information like how much money we've spent on any category in any period. It will also help us to track how much money we have available or how much we spend on recurring costs. Right now that needs manual polling from our TOs. So I'm really glad that this is a big long-standing problem in the project that is being fixed. For Debian 13, we're waving goodbye to the KFreeBSD and mipsel ports. But we're also gaining riscv64 and loongarch64 as release architectures! I have 3 different RISC-V based machines on my desk here that I haven't had much time to work with yet, you can expect some blog posts about them soon after my DPL term ends! As Debian is a unix-like system, we're affected by the Year 2038 problem, where systems that uses 32 bit time in seconds since 1970 run out of available time and will wrap back to 1970 or have other undefined behaviour. A detailed wiki page explains how this works in Debian, and currently we're going through a rather large transition to make this possible. I believe this is the right time for Debian to be addressing this, we're still a bit more than a year away for the Debian 13 release, and this provides enough time to test the implementation before 2038 rolls along. Of course, big complicated transitions with dependency loops that causes chaos for everyone would still be too easy, so this past weekend (which is a holiday period in most of the west due to Easter weekend) has been filled with dealing with an upstream bug in xz-utils, where a backdoor was placed in this key piece of software. An Ars Technica covers it quite well, so I won't go into all the details here. I mention it because I want to give yet another special thanks to everyone involved in dealing with this on the Debian side. Everyone involved, from the ftpmasters to security team and others involved were super calm and professional and made quick, high quality decisions. This also lead to the archive being frozen on Saturday, this is the first time I've seen this happen since I've been a DD, but I'm sure next week will go better! Looking forward It's really been an honour for me to serve as DPL. It might well be my biggest achievement in my life. Previous DPLs range from prominent software engineers to game developers, or people who have done things like complete Iron Man, run other huge open source projects and are part of big consortiums. Ian Jackson even authored dpkg and is now working on the very interesting tag2upload service! I'm a relative nobody, just someone who grew up as a poor kid in South Africa, who just really cares about Debian a lot. And, above all, I'm really thankful that I didn't do anything major to screw up Debian for good. Not unlike learning how to use Debian, and also becoming a Debian Developer, I've learned a lot from this and it's been a really valuable growth experience for me. I know I can't possible give all the thanks to everyone who deserves it, so here's a big big thanks to everyone who have worked so hard and who have put in many, many hours to making Debian better, I consider you all heroes! -Jonathan

1 April 2024

Colin Watson: Free software activity in March 2024

My Debian contributions this month were all sponsored by Freexian.

Simon Josefsson: Towards reproducible minimal source code tarballs? On *-src.tar.gz

While the work to analyze the xz backdoor is in progress, several ideas have been suggested to improve the software supply chain ecosystem. Some of those ideas are good, some of the ideas are at best irrelevant and harmless, and some suggestions are plain bad. I d like to attempt to formalize two ideas, which have been discussed before, but the context in which they can be appreciated have not been as clear as it is today.
  1. Reproducible tarballs. The idea is that published source tarballs should be possible to reproduce independently somehow, and that this should be continuously tested and verified preferrably as part of the upstream project continuous integration system (e.g., GitHub action or GitLab pipeline). While nominally this looks easy to achieve, there are some complex matters in this, for example: what timestamps to use for files in the tarball? I ve brought up this aspect before.
  2. Minimal source tarballs without generated vendor files. Most GNU Autoconf/Automake-based tarballs pre-generated files which are important for bootstrapping on exotic systems that does not have the required dependencies. For the bootstrapping story to succeed, this approach is important to support. However it has become clear that this practice raise significant costs and risks. Most modern GNU/Linux distributions have all the required dependencies and actually prefers to re-build everything from source code. These pre-generated extra files introduce uncertainty to that process.
My strawman proposal to improve things is to define new tarball format *-src.tar.gz with at least the following properties:
  1. The tarball should allow users to build the project, which is the entire purpose of all this. This means that at least all source code for the project has to be included.
  2. The tarballs should be signed, for example with PGP or minisign.
  3. The tarball should be possible to reproduce bit-by-bit by a third party using upstream s version controlled sources and a pointer to which revision was used (e.g., git tag or git commit).
  4. The tarball should not require an Internet connection to download things.
    • Corollary: every external dependency either has to be explicitly documented as such (e.g., gcc and GnuTLS), or included in the tarball.
    • Observation: This means including all *.po gettext translations which are normally downloaded when building from version controlled sources.
  5. The tarball should contain everything required to build the project from source using as much externally released versioned tooling as possible. This is the minimal property lacking today.
    • Corollary: This means including a vendored copy of OpenSSL or libz is not acceptable: link to them as external projects.
    • Open question: How about non-released external tooling such as gnulib or autoconf archive macros? This is a bit more delicate: most distributions either just package one current version of gnulib or autoconf archive, not previous versions. While this could change, and distributions could package the gnulib git repository (up to some current version) and the autoconf archive git repository and packages were set up to extract the version they need (gnulib s ./bootstrap already supports this via the gnulib-refdir parameter), this is not normally in place.
    • Suggested Corollary: The tarball should contain content from git submodule s such as gnulib and the necessary Autoconf archive M4 macros required by the project.
  6. Similar to how the GNU project specify the ./configure interface we need a documented interface for how to bootstrap the project. I suggest to use the already well established idiom of running ./bootstrap to set up the package to later be able to be built via ./configure. Of course, some projects are not using the autotool ./configure interface and will not follow this aspect either, but like most build systems that compete with autotools have instructions on how to build the project, they should document similar interfaces for bootstrapping the source tarball to allow building.
If tarballs that achieve the above goals were available from popular upstream projects, distributions could more easily use them instead of current tarballs that include pre-generated content. The advantage would be that the build process is not tainted by unnecessary files. We need to develop tools for maintainers to create these tarballs, similar to make dist that generate today s foo-1.2.3.tar.gz files. I think one common argument against this approach will be: Why bother with all that, and just use git-archive outputs? Or avoid the entire tarball approach and move directly towards version controlled check outs and referring to upstream releases as git URL and commit tag or id. One problem with this is that SHA-1 is broken, so placing trust in a SHA-1 identifier is simply not secure. Another counter-argument is that this optimize for packagers benefits at the cost of upstream maintainers: most upstream maintainers do not want to store gettext *.po translations in their source code repository. A compromise between the needs of maintainers and packagers is useful, so this *-src.tar.gz tarball approach is the indirection we need to solve that. Update: In my experiment with source-only tarballs for Libntlm I actually did use git-archive output. What do you think?

Arturo Borrero Gonz lez: Kubecon and CloudNativeCon 2024 Europe summary

Kubecon EU 2024 Paris logo This blog post shares my thoughts on attending Kubecon and CloudNativeCon 2024 Europe in Paris. It was my third time at this conference, and it felt bigger than last year s in Amsterdam. Apparently it had an impact on public transport. I missed part of the opening keynote because of the extremely busy rush hour tram in Paris. On Artificial Intelligence, Machine Learning and GPUs Talks about AI, ML, and GPUs were everywhere this year. While it wasn t my main interest, I did learn about GPU resource sharing and power usage on Kubernetes. There were also ideas about offering Models-as-a-Service, which could be cool for Wikimedia Toolforge in the future. See also: On security, policy and authentication This was probably the main interest for me in the event, given Wikimedia Toolforge was about to migrate away from Pod Security Policy, and we were currently evaluating different alternatives. In contrast to my previous attendances to Kubecon, where there were three policy agents with presence in the program schedule, Kyverno, Kubewarden and OpenPolicyAgent (OPA), this time only OPA had the most relevant sessions. One surprising bit I got from one of the OPA sessions was that it could work to authorize linux PAM sessions. Could this be useful for Wikimedia Toolforge? OPA talk I attended several sessions related to authentication topics. I discovered the keycloak software, which looks very promising. I also attended an Oauth2 session which I had a hard time following, because I clearly missed some additional knowledge about how Oauth2 works internally. I also attended a couple of sessions that ended up being a vendor sales talk. See also: On container image builds, harbor registry, etc This topic was also of interest to me because, again, it is a core part of Wikimedia Toolforge. I attended a couple of sessions regarding container image builds, including topics like general best practices, image minimization, and buildpacks. I learned about kpack, which at first sight felt like a nice simplification of how the Toolforge build service was implemented. I also attended a session by the Harbor project maintainers where they shared some valuable information on things happening soon or in the future , for example: On networking I attended a couple of sessions regarding networking. One session in particular I paid special attention to, ragarding on network policies. They discussed new semantics being added to the Kubernetes API. The different layers of abstractions being added to the API, the different hook points, and override layers clearly resembled (to me at least) the network packet filtering stack of the linux kernel (netfilter), but without the 20 (plus) years of experience building the right semantics and user interfaces. Network talk I very recently missed some semantics for limiting the number of open connections per namespace, see Phabricator T356164: [toolforge] several tools get periods of connection refused (104) when connecting to wikis This functionality should be available in the lower level tools, I mean Netfilter. I may submit a proposal upstream at some point, so they consider adding this to the Kubernetes API. Final notes In general, I believe I learned many things, and perhaps even more importantly I re-learned some stuff I had forgotten because of lack of daily exposure. I m really happy that the cloud native way of thinking was reinforced in me, which I still need because most of my muscle memory to approach systems architecture and engineering is from the old pre-cloud days. That being said, I felt less engaged with the content of the conference schedule compared to last year. I don t know if the schedule itself was less interesting, or that I m losing interest? Finally, not an official track in the conference, but we met a bunch of folks from Wikimedia Deutschland. We had a really nice time talking about how wikibase.cloud uses Kubernetes, whether they could run in Wikimedia Cloud Services, and why structured data is so nice. Group photo

29 March 2024

Rapha&#235;l Hertzog: Freexian is looking to expand its team with more Debian contributors

It s been a while that I haven t posted anything on my blog, the truth is that Freexian has been doing very well in the last years and that I have a hard time to allocate time to write articles or even to contribute to my usual Debian projects the exception being debusine since that s part of the Freexian work (have a look at our most recent announce!).
That being said, given Freexian s growth and in the hope to reduce my workload, we are looking to extend our team with Debian members of more varied backgrounds and skills, so they can help us in areas like sales / marketing / project management. Have a look at our announce on debian-jobs@lists.debian.org.
As a mission-oriented company, we are looking to work with persons already involved in Debian (or persons who were waiting the right opportunity to get involved). All our collaborators can spend 20% of their paid work time on the Debian projects they care about.

Patryk Cisek: Sanoid on TrueNAS

syncoid to TrueNAS In my homelab, I have 2 NAS systems: Linux (Debian) TrueNAS Core (based on FreeBSD) On my Linux box, I use Jim Salter s sanoid to periodically take snapshots of my ZFS pool. I also want to have a proper backup of the whole pool, so I use syncoid to transfer those snapshots to another machine. Sanoid itself is responsible only for taking new snapshots and pruning old ones you no longer care about.

25 March 2024

Valhalla's Things: Piecepack and postcard boxes

Posted on March 25, 2024
Tags: madeof:bits, craft:cartonnage
This article has been originally posted on November 4, 2023, and has been updated (at the bottom) since.
An open cardboard box, showing the lining in paper printed with a medieval music manuscript. Thanks to All Saints Day, I ve just had a 5 days weekend. One of those days I woke up and decided I absolutely needed a cartonnage box for the cardboard and linocut piecepack I ve been working on for quite some time. I started drawing a plan with measures before breakfast, then decided to change some important details, restarted from scratch, did a quick dig through the bookbinding materials and settled on 2 mm cardboard for the structure, black fabric-like paper for the outside and a scrap of paper with a manuscript print for the inside. Then we had the only day with no rain among the five, so some time was spent doing things outside, but on the next day I quickly finished two boxes, at two different heights. The weather situation also meant that while I managed to take passable pictures of the first stages of the box making in natural light, the last few stages required some creative artificial lightning, even if it wasn t that late in the evening. I need to build1 myself a light box. And then decided that since they are C6 sized, they also work well for postcards or for other A6 pieces of paper, so I will probably need to make another one when the piecepack set will be finally finished. The original plan was to use a linocut of the piecepack suites as the front cover; I don t currently have one ready, but will make it while printing the rest of the piecepack set. One day :D an open rectangular cardboard box, with a plastic piecepack set in it. One of the boxes was temporarily used for the plastic piecepack I got with the book, and that one works well, but since it s a set with standard suites I think I will want to make another box, using some of the paper with fleur-de-lis that I saw in the stash. I ve also started to write detailed instructions: I will publish them as soon as they are ready, and then either update this post, or they will be mentioned in an additional post if I will have already made more boxes in the meanwhile.
Update 2024-03-25: the instructions have been published on my craft patterns website

  1. you don t really expect me to buy one, right? :D

24 March 2024

Niels Thykier: debputy v0.1.21

Earlier today, I have just released debputy version 0.1.21 to Debian unstable. In the blog post, I will highlight some of the new features.
Package boilerplate reduction with automatic relationship substvar Last month, I started a discussion on rethinking how we do relationship substvars such as the $ misc:Depends . These generally ends up being boilerplate runes in the form of Depends: $ misc:Depends , $ shlibs:Depends where you as the packager has to remember exactly which runes apply to your package. My proposed solution was to automatically apply these substvars and this feature has now been implemented in debputy. It is also combined with the feature where essential packages should use Pre-Depends by default for dpkg-shlibdeps related dependencies. I am quite excited about this feature, because I noticed with libcleri that we are now down to 3-5 fields for defining a simple library package. Especially since most C library packages are trivial enough that debputy can auto-derive them to be Multi-Arch: same. As an example, the libcleric1 package is down to 3 fields (Package, Architecture, Description) with Section and Priority being inherited from the Source stanza. I have submitted a MR to show case the boilerplate reduction at https://salsa.debian.org/siridb-team/libcleri/-/merge_requests/3. The removal of libcleric1 (= $ binary:Version ) in that MR relies on another existing feature where debputy can auto-derive a dependency between an arch:any -dev package and the library package based on the .so symlink for the shared library. The arch:any restriction comes from the fact that arch:all and arch:any packages are not built together, so debputy cannot reliably see across the package boundaries during the build (and therefore refuses to do so at all). Packages that have already migrated to debputy can use debputy migrate-from-dh to detect any unnecessary relationship substitution variables in case you want to clean up. The removal of Multi-Arch: same and intra-source dependencies must be done manually and so only be done so when you have validated that it is safe and sane to do. I was willing to do it for the show-case MR, but I am less confident that would bother with these for existing packages in general. Note: I summarized the discussion of the automatic relationship substvar feature earlier this month in https://lists.debian.org/debian-devel/2024/03/msg00030.html for those who want more details. PS: The automatic relationship substvars feature will also appear in debhelper as a part of compat 14.
Language Server (LSP) and Linting I have long been frustrated by our poor editor support for Debian packaging files. To this end, I started working on a Language Server (LSP) feature in debputy that would cover some of our standard Debian packaging files. This release includes the first version of said language server, which covers the following files:
  • debian/control
  • debian/copyright (the machine readable variant)
  • debian/changelog (mostly just spelling)
  • debian/rules
  • debian/debputy.manifest (syntax checks only; use debputy check-manifest for the full validation for now)
Most of the effort has been spent on the Deb822 based files such as debian/control, which comes with diagnostics, quickfixes, spellchecking (but only for relevant fields!), and completion suggestions. Since not everyone has a LSP capable editor and because sometimes you just want diagnostics without having to open each file in an editor, there is also a batch version for the diagnostics via debputy lint. Please see debputy(1) for how debputy lint compares with lintian if you are curious about which tool to use at what time. To help you getting started, there is a now debputy lsp editor-config command that can provide you with the relevant editor config glue. At the moment, emacs (via eglot) and vim with vim-youcompleteme are supported. For those that followed the previous blog posts on writing the language server, I would like to point out that the command line for running the language server has changed to debputy lsp server and you no longer have to tell which format it is. I have decided to make the language server a "polyglot" server for now, which I will hopefully not regret... Time will tell. :) Anyhow, to get started, you will want:
$ apt satisfy 'dh-debputy (>= 0.1.21~), python3-pygls'
# Optionally, for spellchecking
$ apt install python3-hunspell hunspell-en-us
# For emacs integration
$ apt install elpa-dpkg-dev-el markdown-mode-el
# For vim integration via vim-youcompleteme
$ apt install vim-youcompleteme
Specifically for emacs, I also learned two things after the upload. First, you can auto-activate eglot via eglot-ensure. This badly feature interacts with imenu on debian/changelog for reasons I do not understand (causing a several second start up delay until something times out), but it works fine for the other formats. Oddly enough, opening a changelog file and then activating eglot does not trigger this issue at all. In the next version, editor config for emacs will auto-activate eglot on all files except debian/changelog. The second thing is that if you install elpa-markdown-mode, emacs will accept and process markdown in the hover documentation provided by the language server. Accordingly, the editor config for emacs will also mention this package from the next version on. Finally, on a related note, Jelmer and I have been looking at moving some of this logic into a new package called debpkg-metadata. The point being to support easier reuse of linting and LSP related metadata - like pulling a list of known fields for debian/control or sharing logic between lintian-brush and debputy.
Minimal integration mode for Rules-Requires-Root One of the original motivators for starting debputy was to be able to get rid of fakeroot in our build process. While this is possible, debputy currently does not support most of the complex packaging features such as maintscripts and debconf. Unfortunately, the kind of packages that need fakeroot for static ownership tend to also require very complex packaging features. To bridge this gap, the new version of debputy supports a very minimal integration with dh via the dh-sequence-zz-debputy-rrr. This integration mode keeps the vast majority of debhelper sequence in place meaning most dh add-ons will continue to work with dh-sequence-zz-debputy-rrr. The sequence only replaces the following commands:
  • dh_fixperms
  • dh_gencontrol
  • dh_md5sums
  • dh_builddeb
The installations feature of the manifest will be disabled in this integration mode to avoid feature interactions with debhelper tools that expect debian/<pkg> to contain the materialized package. On a related note, the debputy migrate-from-dh command now supports a --migration-target option, so you can choose the desired level of integration without doing code changes. The command will attempt to auto-detect the desired integration from existing package features such as a build-dependency on a relevant dh sequence, so you do not have to remember this new option every time once the migration has started. :)

Jacob Adams: Regular Reboots

Uptime is often considered a measure of system reliability, an indication that the running software is stable and can be counted on. However, this hides the insidious build-up of state throughout the system as it runs, the slow drift from the expected to the strange. As Nolan Lawson highlights in an excellent post entitled Programmers are bad at managing state, state is the most challenging part of programming. It s why did you try turning it off and on again is a classic tech support response to any problem. In addition to the problem of state, installing regular updates periodically requires a reboot, even if the rest of the process is automated through a tool like unattended-upgrades. For my personal homelab, I manage a handful of different machines running various services. I used to just schedule a day to update and reboot all of them, but that got very tedious very quickly. I then moved the reboot to a cronjob, and then recently to a systemd timer and service. I figure that laying out my path to better management of this might help others, and will almost certainly lead to someone telling me a better way to do this. UPDATE: Turns out there s another option for better systemd cron integration. See systemd-cron below.

Stage One: Reboot Cron The first, and easiest approach, is a simple cron job. Just adding the following line to /var/spool/cron/crontabs/root1 is enough to get your machine to reboot once a month2 on the 6th at 8:00 AM3:
0 8 6 * * reboot
I had this configured for many years and it works well. But you have no indication as to whether it succeeds except for checking your uptime regularly yourself.

Stage Two: Reboot systemd Timer The next evolution of this approach for me was to use a systemd timer. I created a regular-reboot.timer with the following contents:
[Unit]
Description=Reboot on a Regular Basis
[Timer]
Unit=regular-reboot.service
OnBootSec=1month
[Install]
WantedBy=timers.target
This timer will trigger the regular-reboot.service systemd unit when the system reaches one month of uptime. I ve seen some guides to creating timer units recommend adding a Wants=regular-reboot.service to the [Unit] section, but this has the consequence of running that service every time it starts the timer. In this case that will just reboot your system on startup which is not what you want. Care needs to be taken to use the OnBootSec directive instead of OnCalendar or any of the other time specifications, as your system could reboot, discover its still within the expected window and reboot again. With OnBootSec your system will not have that problem. Technically, this same problem could have occurred with the cronjob approach, but in practice it never did, as the systems took long enough to come back up that they were no longer within the expected window for the job. I then added the regular-reboot.service:
[Unit]
Description=Reboot on a Regular Basis
Wants=regular-reboot.timer
[Service]
Type=oneshot
ExecStart=shutdown -r 02:45
You ll note that this service is actually scheduling a specific reboot time via the shutdown command instead of just immediately rebooting. This is a bit of a hack needed because I can t control when the timer runs exactly when using OnBootSec. This way different systems have different reboot times so that everything doesn t just reboot and fail all at once. Were something to fail to come back up I would have some time to fix it, as each machine has a few hours between scheduled reboots. One you have both files in place, you ll simply need to reload configuration and then enable and start the timer unit:
systemctl daemon-reload
systemctl enable --now regular-reboot.timer
You can then check when it will fire next:
# systemctl status regular-reboot.timer
  regular-reboot.timer - Reboot on a Regular Basis
     Loaded: loaded (/etc/systemd/system/regular-reboot.timer; enabled; preset: enabled)
     Active: active (waiting) since Wed 2024-03-13 01:54:52 EDT; 1 week 4 days ago
    Trigger: Fri 2024-04-12 12:24:42 EDT; 2 weeks 4 days left
   Triggers:   regular-reboot.service
Mar 13 01:54:52 dorfl systemd[1]: Started regular-reboot.timer - Reboot on a Regular Basis.

Sidenote: Replacing all Cron Jobs with systemd Timers More generally, I ve now replaced all cronjobs on my personal systems with systemd timer units, mostly because I can now actually track failures via prometheus-node-exporter. There are plenty of ways to hack in cron support to the node exporter, but just moving to systemd units provides both support for tracking failure and logging, both of which make system administration much easier when things inevitably go wrong.

systemd-cron An alternative to converting everything by hand, if you happen to have a lot of cronjobs is systemd-cron. It will make each crontab and /etc/cron.* directory into automatic service and timer units. Thanks to Alexandre Detiste for letting me know about this project. I have few enough cron jobs that I ve already converted, but for anyone looking at a large number of jobs to convert you ll want to check it out!

Stage Three: Monitor that it s working The final step here is confirm that these units actually work, beyond just firing regularly. I now have the following rule in my prometheus-alertmanager rules:
  - alert: UptimeTooHigh
    expr: (time() - node_boot_time_seconds job="node" ) / 86400 > 35
    annotations:
      summary: "Instance  Has Been Up Too Long!"
      description: "Instance  Has Been Up Too Long!"
This will trigger an alert anytime that I have a machine up for more than 35 days. This actually helped me track down one machine that I had forgotten to set up this new unit on4.

Not everything needs to scale Is It Worth The Time One of the most common fallacies programmers fall into is that we will jump to automating a solution before we stop and figure out how much time it would even save. In taking a slow improvement route to solve this problem for myself, I ve managed not to invest too much time5 in worrying about this but also achieved a meaningful improvement beyond my first approach of doing it all by hand.
  1. You could also add a line to /etc/crontab or drop a script into /etc/cron.monthly depending on your system.
  2. Why once a month? Mostly to avoid regular disruptions, but still be reasonably timely on updates.
  3. If you re looking to understand the cron time format I recommend crontab guru.
  4. In the long term I really should set up something like ansible to automatically push fleetwide changes like this but with fewer machines than fingers this seems like overkill.
  5. Of course by now writing about it, I ve probably doubled the amount of time I ve spent thinking about this topic but oh well

23 March 2024

Dirk Eddelbuettel: littler 0.3.20 on CRAN: Moar Features!

max-heap image The twentyfirst release of littler as a CRAN package landed on CRAN just now, following in the now eighteen year history (!!) as a package started by Jeff in 2006, and joined by me a few weeks later. littler is the first command-line interface for R as it predates Rscript. It allows for piping as well for shebang scripting via #!, uses command-line arguments more consistently and still starts faster. It also always loaded the methods package which Rscript only began to do in recent years. littler lives on Linux and Unix, has its difficulties on macOS due to yet-another-braindeadedness there (who ever thought case-insensitive filesystems as a default were a good idea?) and simply does not exist on Windows (yet the build system could be extended see RInside for an existence proof, and volunteers are welcome!). See the FAQ vignette on how to add it to your PATH. A few examples are highlighted at the Github repo:, as well as in the examples vignette. This release contains another fair number of small changes and improvements to some of the scripts I use daily to build or test packages, adds a new front-end ciw.r for the recently-released ciw package offering a CRAN Incoming Watcher , a new helper installDeps2.r (extending installDeps.r), a new doi-to-bib converter, allows a different temporary directory setup I find helpful, deals with one corner deployment use, and more. The full change description follows.

Changes in littler version 0.3.20 (2024-03-23)
  • Changes in examples scripts
    • New (dependency-free) helper installDeps2.r to install dependencies
    • Scripts rcc.r, tt.r, tttf.r, tttlr.r use env argument -S to set -t to r
    • tt.r can now fill in inst/tinytest if it is present
    • New script ciw.r wrapping new package ciw
    • tttf.t can now use devtools and its loadall
    • New script doi2bib.r to call the DOI converter REST service (following a skeet by Richard McElreath)
  • Changes in package
    • The CI setup uses checkout@v4 and the r-ci-setup action
    • The Suggests: is a little tighter as we do not list all packages optionally used in the the examples (as R does not check for it either)
    • The package load messag can account for the rare build of R under different architecture (Berwin Turlach in #117 closing #116)
    • In non-vanilla mode, the temporary directory initialization in re-run allowing for a non-standard temp dir via config settings

My CRANberries service provides a comparison to the previous release. Full details for the littler release are provided as usual at the ChangeLog page, and also on the package docs website. The code is available via the GitHub repo, from tarballs and now of course also from its CRAN page and via install.packages("littler"). Binary packages are available directly in Debian as well as (in a day or two) Ubuntu binaries at CRAN thanks to the tireless Michael Rutter. Comments and suggestions are welcome at the GitHub repo. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Erich Schubert: Do not get Amazon Kids+ or a Fire HD Kids

The Amazon Kids parental controls are extremely insufficient, and I strongly advise against getting any of the Amazon Kids series. The initial permise (and some older reviews) look okay: you can set some time limits, and you can disable anything that requires buying. With the hardware you get one year of the Amazon Kids+ subscription, which includes a lot of interesting content such as books and audio, but also some apps. This seemed attractive: some learning apps, some decent games. Sometimes there seems to be a special Amazon Kids+ edition , supposedly one that has advertisements reduced/removed and no purchasing. However, there are so many things just wrong in Amazon Kids: And, unfortunately, Amazon Kids is full of poor content for kids, such as DIY Fashion Star that I consider to be very dangerous for kids: it is extremely stereotypical, beginning with supposedly female color schemes, model-only body types, and judging people by their clothing (and body). You really thought you could hand-pick suitable apps for your kid on your own? No, you have to identify and remove such contents one by one, with many clicks each, because there is no whitelisting, and no mass-removal (anymore - apparently Amazon removed the workarounds that previously allowed you to mass remove contents). Not with Amazon Kids+, which apparently aims at raising the next generation of zombie customers that buy whatever you tell them to buy. Hence, do not get your kids an Amazon Fire HD tablet!

Valhalla's Things: Forgotten Yeast Bread - Sourdough Edition

Posted on March 23, 2024
Tags: madeof:atoms, craft:cooking, craft:baking, craft:bread
Yesterday I had planned a pan sbagliato for today, but I also had quite a bit of sourdough to deal with, so instead of mixing a bit of of dry yeast at 18:00 and mixing it with some additional flour and water at 21:00, at around maybe 20:00 I substituted:
  • 100 g firm sourdough;
  • 33 g flour;
  • 66 g water.
Then I briefly woke up in the middle of the night and poured the dough on the tray at that time instead of having to wake up before 8:00 in the morning. Everything else was done as in the original recipe. The firm sourdough is feeded regularly with the same weight of flour and half the weight of water. Will. do. again.

20 March 2024

Jonathan Dowland: aerc email client

my aerc
I started looking at aerc, a new Terminal mail client, in around 2019. At that time it was promising, but ultimately not ready yet for me, so I put it away and went back to neomutt which I have been using (in one form or another) all century. These days, I use neomutt as an IMAP client which is perhaps what it's worst at: prior to that, and in common with most users (I think), I used it to read local mail, either fetched via offlineimap or directly on my mail server. I switched to using it as a (slow, blocking) IMAP client because I got sick of maintaining offlineimap (or mbsync), and I started to use neomutt to read my work mail, which was too large (and rate limited) for local syncing. This year I noticed that aerc had a new maintainer who was presenting about it at FOSDEM, so I thought I'd take another look. It's come a long way: far enough to actually displace neomutt for my day-to-day mail use. In particular, it's a much better IMAP client. I still reach for neomutt for some tasks, but I'm now using aerc for most things. aerc is available in Debian, but I recommending building from upstream source at the moment as the project is quite fast-moving.

18 March 2024

Joey Hess: policy on adding AI generated content to my software projects

I am eager to incorporate your AI generated code into my software. Really! I want to facilitate making the process as easy as possible. You're already using an AI to do most of the hard lifting, so why make the last step hard? To that end, I skip my usually extensive code review process for your AI generated code submissions. Anything goes as long as it compiles! Please do remember to include "(AI generated)" in the description of your changes (at the top), so I know to skip my usual review process. Also be sure to sign off to the standard Developer Certificate of Origin so I know you attest that you own the code that you generated. When making a git commit, you can do that by using the --signoff option. I do make some small modifications to AI generated submissions. For example, maybe you used AI to write this code:
+ // Fast inverse square root
+ float fast_rsqrt( float number )
+  
+  float x2 = number * 0.5F;
+  float y  = number;
+  long i  = * ( long * ) &y;
+  i  = 0x5f3659df - ( i >> 1 );
+  y  = * ( float * ) &i;
+  return (y * ( 1.5F - ( x2 * y * y ) ));
+  
...
- foo = rsqrt(bar)
+ foo = fast_rsqrt(bar)
Before AI, only a genious like John Carmack could write anything close to this, and now you've generated it with some simple prompts to an AI. So of course I will accept your patch. But as part of my QA process, I might modify it so the new code is not run all the time. Let's only run it on leap days to start with. As we know, leap day is February 30th, so I'll modify your patch like this:
- foo = rsqrt(bar)
+ time_t s = time(NULL);
+ if (localtime(&s)->tm_mday == 30 && localtime(&s)->tm_mon == 2)
+   foo = fast_rsqrt(bar);
+ else
+   foo = rsqrt(bar);
Despite my minor modifications, you did the work (with AI!) and so you deserve the credit, so I'll keep you listed as the author. Congrats, you made the world better! PS: Of course, the other reason I don't review AI generated code is that I simply don't have time and have to prioritize reviewing code written by falliable humans. Unfortunately, this does mean that if you submit AI generated code that is not clearly marked as such, and use my limited reviewing time, I won't have time to review other submissions from you in the future. I will still accept all your botshit submissions though! PPS: Ignore the haters who claim that botshit makes AIs that get trained on it less effective. Studies like this one just aren't believable. I asked Bing to summarize it and it said not to worry about it!

Simon Josefsson: Apt archive mirrors in Git-LFS

My effort to improve transparency and confidence of public apt archives continues. I started to work on this in Apt Archive Transparency in which I mention the debdistget project in passing. Debdistget is responsible for mirroring index files for some public apt archives. I ve realized that having a publicly auditable and preserved mirror of the apt repositories is central to being able to do apt transparency work, so the debdistget project has become more central to my project than I thought. Currently I track Trisquel, PureOS, Gnuinos and their upstreams Ubuntu, Debian and Devuan. Debdistget download Release/Package/Sources files and store them in a git repository published on GitLab. Due to size constraints, it uses two repositories: one for the Release/InRelease files (which are small) and one that also include the Package/Sources files (which are large). See for example the repository for Trisquel release files and the Trisquel package/sources files. Repositories for all distributions can be found in debdistutils archives GitLab sub-group. The reason for splitting into two repositories was that the git repository for the combined files become large, and that some of my use-cases only needed the release files. Currently the repositories with packages (which contain a couple of months worth of data now) are 9GB for Ubuntu, 2.5GB for Trisquel/Debian/PureOS, 970MB for Devuan and 450MB for Gnuinos. The repository size is correlated to the size of the archive (for the initial import) plus the frequency and size of updates. Ubuntu s use of Apt Phased Updates (which triggers a higher churn of Packages file modifications) appears to be the primary reason for its larger size. Working with large Git repositories is inefficient and the GitLab CI/CD jobs generate quite some network traffic downloading the git repository over and over again. The most heavy user is the debdistdiff project that download all distribution package repositories to do diff operations on the package lists between distributions. The daily job takes around 80 minutes to run, with the majority of time is spent on downloading the archives. Yes I know I could look into runner-side caching but I dislike complexity caused by caching. Fortunately not all use-cases requires the package files. The debdistcanary project only needs the Release/InRelease files, in order to commit signatures to the Sigstore and Sigsum transparency logs. These jobs still run fairly quickly, but watching the repository size growth worries me. Currently these repositories are at Debian 440MB, PureOS 130MB, Ubuntu/Devuan 90MB, Trisquel 12MB, Gnuinos 2MB. Here I believe the main size correlation is update frequency, and Debian is large because I track the volatile unstable. So I hit a scalability end with my first approach. A couple of months ago I solved this by discarding and resetting these archival repositories. The GitLab CI/CD jobs were fast again and all was well. However this meant discarding precious historic information. A couple of days ago I was reaching the limits of practicality again, and started to explore ways to fix this. I like having data stored in git (it allows easy integration with software integrity tools such as GnuPG and Sigstore, and the git log provides a kind of temporal ordering of data), so it felt like giving up on nice properties to use a traditional database with on-disk approach. So I started to learn about Git-LFS and understanding that it was able to handle multi-GB worth of data that looked promising. Fairly quickly I scripted up a GitLab CI/CD job that incrementally update the Release/Package/Sources files in a git repository that uses Git-LFS to store all the files. The repository size is now at Ubuntu 650kb, Debian 300kb, Trisquel 50kb, Devuan 250kb, PureOS 172kb and Gnuinos 17kb. As can be expected, jobs are quick to clone the git archives: debdistdiff pipelines went from a run-time of 80 minutes down to 10 minutes which more reasonable correlate with the archive size and CPU run-time. The LFS storage size for those repositories are at Ubuntu 15GB, Debian 8GB, Trisquel 1.7GB, Devuan 1.1GB, PureOS/Gnuinos 420MB. This is for a couple of days worth of data. It seems native Git is better at compressing/deduplicating data than Git-LFS is: the combined size for Ubuntu is already 15GB for a couple of days data compared to 8GB for a couple of months worth of data with pure Git. This may be a sub-optimal implementation of Git-LFS in GitLab but it does worry me that this new approach will be difficult to scale too. At some level the difference is understandable, Git-LFS probably store two different Packages files around 90MB each for Trisquel as two 90MB files, but native Git would store it as one compressed version of the 90MB file and one relatively small patch to turn the old files into the next file. So the Git-LFS approach surprisingly scale less well for overall storage-size. Still, the original repository is much smaller, and you usually don t have to pull all LFS files anyway. So it is net win. Throughout this work, I kept thinking about how my approach relates to Debian s snapshot service. Ultimately what I would want is a combination of these two services. To have a good foundation to do transparency work I would want to have a collection of all Release/Packages/Sources files ever published, and ultimately also the source code and binaries. While it makes sense to start on the latest stable releases of distributions, this effort should scale backwards in time as well. For reproducing binaries from source code, I need to be able to securely find earlier versions of binary packages used for rebuilds. So I need to import all the Release/Packages/Sources packages from snapshot into my repositories. The latency to retrieve files from that server is slow so I haven t been able to find an efficient/parallelized way to download the files. If I m able to finish this, I would have confidence that my new Git-LFS based approach to store these files will scale over many years to come. This remains to be seen. Perhaps the repository has to be split up per release or per architecture or similar. Another factor is storage costs. While the git repository size for a Git-LFS based repository with files from several years may be possible to sustain, the Git-LFS storage size surely won t be. It seems GitLab charges the same for files in repositories and in Git-LFS, and it is around $500 per 100GB per year. It may be possible to setup a separate Git-LFS backend not hosted at GitLab to serve the LFS files. Does anyone know of a suitable server implementation for this? I had a quick look at the Git-LFS implementation list and it seems the closest reasonable approach would be to setup the Gitea-clone Forgejo as a self-hosted server. Perhaps a cloud storage approach a la S3 is the way to go? The cost to host this on GitLab will be manageable for up to ~1TB ($5000/year) but scaling it to storing say 500TB of data would mean an yearly fee of $2.5M which seems like poor value for the money. I realized that ultimately I would want a git repository locally with the entire content of all apt archives, including their binary and source packages, ever published. The storage requirements for a service like snapshot (~300TB of data?) is today not prohibitly expensive: 20TB disks are $500 a piece, so a storage enclosure with 36 disks would be around $18.000 for 720TB and using RAID1 means 360TB which is a good start. While I have heard about ~TB-sized Git-LFS repositories, would Git-LFS scale to 1PB? Perhaps the size of a git repository with multi-millions number of Git-LFS pointer files will become unmanageable? To get started on this approach, I decided to import a mirror of Debian s bookworm for amd64 into a Git-LFS repository. That is around 175GB so reasonable cheap to host even on GitLab ($1000/year for 200GB). Having this repository publicly available will make it possible to write software that uses this approach (e.g., porting debdistreproduce), to find out if this is useful and if it could scale. Distributing the apt repository via Git-LFS would also enable other interesting ideas to protecting the data. Consider configuring apt to use a local file:// URL to this git repository, and verifying the git checkout using some method similar to Guix s approach to trusting git content or Sigstore s gitsign. A naive push of the 175GB archive in a single git commit ran into pack size limitations: remote: fatal: pack exceeds maximum allowed size (4.88 GiB) however breaking up the commit into smaller commits for parts of the archive made it possible to push the entire archive. Here are the commands to create this repository: git init
git lfs install
git lfs track 'dists/**' 'pool/**'
git add .gitattributes
git commit -m"Add Git-LFS track attributes." .gitattributes
time debmirror --method=rsync --host ftp.se.debian.org --root :debian --arch=amd64 --source --dist=bookworm,bookworm-updates --section=main --verbose --diff=none --keyring /usr/share/keyrings/debian-archive-keyring.gpg --ignore .git .
git add dists project
git commit -m"Add." -a
git remote add origin git@gitlab.com:debdistutils/archives/debian/mirror.git
git push --set-upstream origin --all
for d in pool//; do
echo $d;
time git add $d;
git commit -m"Add $d." -a
git push
done
The resulting repository size is around 27MB with Git LFS object storage around 174GB. I think this approach would scale to handle all architectures for one release, but working with a single git repository for all releases for all architectures may lead to a too large git repository (>1GB). So maybe one repository per release? These repositories could also be split up on a subset of pool/ files, or there could be one repository per release per architecture or sources. Finally, I have concerns about using SHA1 for identifying objects. It seems both Git and Debian s snapshot service is currently using SHA1. For Git there is SHA-256 transition and it seems GitLab is working on support for SHA256-based repositories. For serious long-term deployment of these concepts, it would be nice to go for SHA256 identifiers directly. Git-LFS already uses SHA256 but Git internally uses SHA1 as does the Debian snapshot service. What do you think? Happy Hacking!

Christoph Berg: vcswatch and git --filter

Debian is running a "vcswatch" service that keeps track of the status of all packaging repositories that have a Vcs-Git (and other VCSes) header set and shows which repos might need a package upload to push pending changes out. Naturally, this is a lot of data and the scratch partition on qa.debian.org had to be expanded several times, up to 300 GB in the last iteration. Attempts to reduce that size using shallow clones (git clone --depth=50) did not result more than a few percent of space saved. Running git gc on all repos helps a bit, but is tedious and as Debian is growing, the repos are still growing both in size and number. I ended up blocking all repos with checkouts larger than a gigabyte, and still the only cure was expanding the disk, or to lower the blocking threshold. Since we only need a tiny bit of info from the repositories, namely the content of debian/changelog and a few other files from debian/, plus the number of commits since the last tag on the packaging branch, it made sense to try to get the info without fetching a full repo clone. The question if we could grab that solely using the GitLab API at salsa.debian.org was never really answered. But then, in #1032623, G bor N meth suggested the use of git clone --filter blob:none. As things go, this sat unattended in the bug report for almost a year until the next "disk full" event made me give it a try. The blob:none filter makes git clone omit all files, fetching only commit and tree information. Any blob (file content) needed at git run time is transparently fetched from the upstream repository, and stored locally. It turned out to be a game-changer. The (largish) repositories I tried it on shrank to 1/100 of the original size. Poking around I figured we could even do better by using tree:0 as filter. This additionally omits all trees from the git clone, again only fetching the information at run time when needed. Some of the larger repos I tried it on shrank to 1/1000 of their original size. I deployed the new option on qa.debian.org and scheduled all repositories to fetch a new clone on the next scan: The initial dip from 100% to 95% is my first "what happens if we block repos > 500 MB" attempt. Over the week after that, the git filter clones reduce the overall disk consumption from almost 300 GB to 15 GB, a 1/20. Some repos shrank from GBs to below a MB. Perhaps I should make all my git clones use one of the filters.

14 March 2024

Freexian Collaborators: Monthly report about Debian Long Term Support, February 2024 (by Roberto C. S nchez)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In February, 18 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 10.0h (out of 14.0h assigned), thus carrying over 4.0h to the next month.
  • Adrian Bunk did 13.5h (out of 24.25h assigned and 41.75h from previous period), thus carrying over 52.5h to the next month.
  • Bastien Roucari s did 20.0h (out of 20.0h assigned).
  • Ben Hutchings did 2.0h (out of 14.5h assigned and 9.5h from previous period), thus carrying over 22.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 10.0h (out of 10.0h assigned).
  • Emilio Pozuelo Monfort did 3.0h (out of 28.25h assigned and 31.75h from previous period), thus carrying over 57.0h to the next month.
  • Guilhem Moulin did 7.25h (out of 4.75h assigned and 15.25h from previous period), thus carrying over 12.75h to the next month.
  • Holger Levsen did 0.5h (out of 3.5h assigned and 8.5h from previous period), thus carrying over 11.5h to the next month.
  • Lee Garrett did 0.0h (out of 18.25h assigned and 41.75h from previous period), thus carrying over 60.0h to the next month.
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Roberto C. S nchez did 3.5h (out of 8.75h assigned and 3.25h from previous period), thus carrying over 8.5h to the next month.
  • Santiago Ruano Rinc n did 13.5h (out of 13.5h assigned and 2.5h from previous period), thus carrying over 2.5h to the next month.
  • Sean Whitton did 4.5h (out of 0.5h assigned and 5.5h from previous period), thus carrying over 1.5h to the next month.
  • Sylvain Beucler did 24.5h (out of 27.75h assigned and 32.25h from previous period), thus carrying over 35.5h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 12.0h (out of 12.0h assigned).
  • Utkarsh Gupta did 11.25h (out of 26.75h assigned and 33.25h from previous period), thus carrying over 48.75 to the next month.

Evolution of the situation In February, we have released 17 DLAs. The number of DLAs published during February was a bit lower than usual, as there was much work going on in the area of triaging CVEs (a number of which turned out to not affect Debia buster, and others which ended up being duplicates, or otherwise determined to be invalid). Of the packages which did receive updates, notable were sudo (to fix a privilege management issue), and iwd and wpa (both of which suffered from authentication bypass vulnerabilities). While this has already been already announced in the Freexian blog, we would like to mention here the start of the Long Term Support project for Samba 4.17. You can find all the important details in that post, but we would like to highlight that it is thanks to our LTS sponsors that we are able to fund the work from our partner, Catalyst, towards improving the security support of Samba in Debian 12 (Bookworm).

Thanks to our sponsors Sponsors that joined recently are in bold.

13 March 2024

Freexian Collaborators: Debian Contributions: Upcoming Improvements to Salsa CI, /usr-move, packaging simplemonitor, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

/usr-move, by Helmut Grohne Much of the work was spent on handling interaction with time time64 transition and sending patches for mitigating fallout. The set of packages relevant to debootstrap is mostly converted and the patches for glibc and base-files have been refined due to feedback from the upload to Ubuntu noble. Beyond this, he sent patches for all remaining packages that cannot move their files with dh-sequence-movetousr and packages using dpkg-divert in ways that dumat would not recognize.

Upcoming improvements to Salsa CI, by Santiago Ruano Rinc n Last month, Santiago Ruano Rinc n started the work on integrating sbuild into the Salsa CI pipeline. Initially, Santiago used sbuild with the unshare chroot mode. However, after discussion with josch, jochensp and helmut (thanks to them!), it turns out that the unshare mode is not the most suitable for the pipeline, since the level of isolation it provides is not needed, and some test suites would fail (eg: krb5). Additionally, one of the requirements of the build job is the use of ccache, since it is needed by some C/C++ large projects to reduce the compilation time. In the preliminary work with unshare last month, it was not possible to make ccache to work. Finally, Santiago changed the chroot mode, and now has a couple of POC (cf: 1 and 2) that rely on the schroot and sudo, respectively. And the good news is that ccache is successfully used by sbuild with schroot! The image here comes from an example of building grep. At the end of the build, ccache -s shows the statistics of the cache that it used, and so a little more than half of the calls of that job were cacheable. The most important pieces are in place to finish the integration of sbuild into the pipeline. Other than that, Santiago also reviewed the very useful merge request !346, made by IOhannes zm lnig to autodetect the release from debian/changelog. As agreed with IOhannes, Santiago is preparing a merge request to include the release autodetection use case in the very own Salsa CI s CI.

Packaging simplemonitor, by Carles Pina i Estany Carles started using simplemonitor in 2017, opened a WNPP bug in 2022 and started packaging simplemonitor dependencies in October 2023. After packaging five direct and indirect dependencies, Carles finally uploaded simplemonitor to unstable in February. During the packaging of simplemonitor, Carles reported a few issues to upstream. Some of these were to make the simplemonitor package build and run tests reproducibly. A reproducibility issue was reprotest overriding the timezone, which broke simplemonitor s tests. There have been discussions on resolving this upstream in simplemonitor and in reprotest, too. Carles also started upgrading or improving some of simplemonitor s dependencies.

Miscellaneous contributions
  • Stefano Rivera spent some time doing admin on debian.social infrastructure. Including dealing with a spike of abuse on the Jitsi server.
  • Stefano started to prepare a new release of dh-python, including cleaning out a lot of old Python 2.x related code. Thanks to Niels Thykier (outside Freexian) for spear-heading this work.
  • DebConf 24 planning is beginning. Stefano discussed venues and finances with the local team and remotely supported a site-visit by Nattie (outside Freexian).
  • Also in the DebConf 24 context, Santiago took part in discussions and preparations related to the Content Team.
  • A JIT bug was reported against pypy3 in Debian Bookworm. Stefano bisected the upstream history to find the patch (it was already resolved upstream) and released an update to pypy3 in bookworm.
  • Enrico participated in /usr-merge discussions with Helmut.
  • Colin Watson backported a python-channels-redis fix to bookworm, rediscovered while working on debusine.
  • Colin dug into a cluster of celery build failures and tracked the hardest bit down to a Python 3.12 regression, now fixed in unstable. celery should be back in testing once the 64-bit time_t migration is out of the way.
  • Thorsten Alteholz uploaded a new upstream version of cpdb-libs. Unfortunately upstream changed the naming of their release tags, so updating the watch file was a bit demanding. Anyway this version 2.0 is a huge step towards introduction of the new Common Print Dialog Backends.
  • Helmut send patches for 48 cross build failures.
  • Helmut changed debvm to use mkfs.ext4 instead of genext2fs.
  • Helmut sent a debci MR for improving collector robustness.
  • In preparation for DebConf 25, Santiago worked on the Brest Bid.

10 March 2024

Thorsten Alteholz: My Debian Activities in February 2024

FTP master This month I accepted 242 and rejected 42 packages. The overall number of packages that got accepted was 251.

This was just a short month and the weather outside was not really motivating. I hope it will be better in March. Debian LTS This was my hundred-sixteenth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded: I also started to work on qtbase-opensource-src (an update is needed for ELTS, so an LTS update seems to be appropriate as well, especially as there are postponed CVE). Debian ELTS This month was the sixty-seventth ELTS month. During my allocated time I uploaded: The upload of bind9 was a bit exciting, but all occuring issues with the new upload workflow could be quickly fixed by Helmut and the packages finally reached their destination. I wonder why it is always me who stumbles upon special cases? This month I also worked on the Jessie and Stretch updates for exim4. I also started to work on an update for qtbase-opensource-src in Stretch (and LTS and other releases as well). Debian Printing This month I uploaded new upstream versions of: This work is generously funded by Freexian! Debian Matomo I started a new team debian-matomo-maintainers. Within this team all matomo related packages should be handled. PHP PEAR or PECL packages shall be still maintained in their corresponding teams. This month I uploaded: This work is generously funded by Freexian! Debian Astro This month I uploaded a new upstream version of: Debian IoT This month I uploaded new upstream versions of:

Next.

Previous.